
    A Faster Circular Binary Segmentation Algorithm for the Analysis of Array CGH Data

    Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number (Olshen et al., 2004). The algorithm tests for change-points using a maximal t-statistic with a permutation reference distribution to obtain the corresponding p-value. The number of computations required for the maximal test statistic is O(N^2), where N is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm. Results: We present a hybrid approach to obtain the p-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analysis of array CGH data from a breast cancer cell line to show the impact of the new approaches on the analysis of real data. Availability: An R (R Development Core Team, 2006) version of the CBS algorithm has been implemented in the "DNAcopy" package of the Bioconductor project (Gentleman et al., 2004). The proposed hybrid method for the p-value is available in version 1.2.1 or higher and the stopping rule for declaring a change early is available in version 1.5.1 or higher.
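
    The O(N^2) cost mentioned in the abstract can be illustrated with a minimal Python sketch of the full permutation test that the hybrid method is meant to replace. This is not the DNAcopy implementation: the function names are hypothetical, and the statistic is standardized by the overall sample standard deviation rather than a pooled within-segment estimate, which is a simplifying assumption made here.

```python
import math
import random
from itertools import accumulate


def max_circular_stat(x):
    """Largest standardized mean difference between an arc (i, j] and its
    complement, scanned over all O(N^2) arcs of the circularized data."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    # Overall sample standard deviation (simplifying assumption, see lead-in).
    sd = math.sqrt(sum(c * c for c in centered) / (n - 1))
    # Partial sums of the centered data, so the complement sum is -diff.
    s = [0.0] + list(accumulate(centered))
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = j - i  # number of markers inside the arc
            if k == n:
                continue  # the arc covering every marker has no complement
            diff = s[j] - s[i]
            z = (diff / k + diff / (n - k)) / math.sqrt(1.0 / k + 1.0 / (n - k))
            best = max(best, abs(z))
    return best / sd


def permutation_p_value(x, n_perm=200, seed=0):
    """Permutation p-value for the maximal statistic (full permutation test)."""
    rng = random.Random(seed)
    observed = max_circular_stat(x)
    shuffled = list(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_circular_stat(shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

    Each permutation repeats the quadratic scan, so the total work grows as n_perm times N^2, which is the cost that becomes prohibitive for arrays with tens of thousands of markers.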

    Effect of the Intrinsic Width on the Piezoelectric Force Microscopy of a Single Ferroelectric Domain Wall

    Intrinsic domain wall width is a fundamental parameter that reflects bulk ferroelectric properties and governs the performance of ferroelectric memory devices. We present closed-form analytical expressions for vertical and lateral piezoelectric force microscopy (PFM) profiles for the conical and disc models of the tip, beyond point charge and sphere approximations. The analysis takes into account the finite intrinsic width of the domain wall and the dielectric anisotropy of the material. These analytical expressions provide insight into the mechanisms of PFM image formation and can be used for quantitative analysis of the PFM domain wall profiles. The PFM profile of a realistic domain wall is shown to be the convolution of its intrinsic profile and the resolution function of PFM. Comment: 25 pages, 5 figures, 3 tables, 3 Appendices, To be submitted to J. Appl. Phy
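
    The convolution statement in the abstract can be illustrated numerically. The Python sketch below is only a toy model: the tanh intrinsic profile, the Gaussian resolution function, and all names and parameter values are assumptions chosen for illustration, and they stand in for the paper's closed-form expressions for conical and disc tips, which are not reproduced here.

```python
import math


def simulated_wall(width=1.0, sigma=2.0, half_span=20.0, dx=0.1):
    """Convolve an assumed tanh wall profile with an assumed Gaussian
    resolution function; returns (xs, intrinsic, measured)."""
    n = int(round(2 * half_span / dx)) + 1
    xs = [-half_span + i * dx for i in range(n)]
    intrinsic = [math.tanh(x / width) for x in xs]

    # Gaussian kernel sampled on +/- 4 sigma, normalized to unit area.
    half_k = int(round(4 * sigma / dx))
    kernel = [math.exp(-((k - half_k) * dx) ** 2 / (2 * sigma ** 2))
              for k in range(2 * half_k + 1)]
    area = sum(kernel) * dx
    kernel = [w / area for w in kernel]

    measured = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Clamp indices so the saturated domain values extend past the grid.
            idx = min(max(i - (k - half_k), 0), n - 1)
            acc += intrinsic[idx] * w
        measured.append(acc * dx)
    return xs, intrinsic, measured
```

    In this toy model the measured profile keeps the same saturation values far from the wall but has a shallower slope at the wall centre, so a wall width read directly from the measured profile would overestimate the intrinsic width unless the resolution function is accounted for.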

    PAC-learning is Undecidable

    The problem of attempting to learn the mapping between data and labels is the crux of any machine learning task. It is, therefore, of interest to the machine learning community on practical as well as theoretical counts to consider the existence of a test or criterion for deciding the feasibility of attempting to learn. We investigate the existence of such a criterion in the setting of PAC-learning, basing the feasibility solely on whether the mapping to be learnt lends itself to approximation by a given class of hypothesis functions. We show that no such criterion exists, exposing a fundamental limitation in the decidability of learning. In other words, we prove that testing for PAC-learnability is undecidable in the Turing sense. We also briefly discuss some of the probable implications of this result for the current practice of machine learning.